Fundamental assumptions of parametric models

Dr Jens Roeser

Learning outcomes

After completing this lecture, the workshop and your own reading you should be able to …

  • name the properties of the normal distribution,
  • explain the core assumptions of parametric tests,
  • and describe the essence of the central limit theorem.

Model assumptions

Reminder 1

Models are approximations of reality focusing on practically relevant aspects.

Reminder 2

“All models are wrong, but some are useful.” – George Box

  • Help us to understand a complex matter.
  • Ideally we want to be able to assess limitations to their usefulness.

(Statistical) Models are also machines

Rube Goldberg machine: a machine intentionally designed to perform a simple task in an overly complicated way.

Why are the model assumptions important?

  • Machines need input.
  • Perform operations on input.
  • Always give some output.
  • Parametric statistical models make assumptions about the input they receive.
  • The reliability of the output depends on the fit between the input and these assumptions.

What is a parametric model?

  • A family of probability distributions with a finite number of parameters.
  • E.g. normal distribution has two parameters: mean and standard deviation
  • Parametric models make assumptions about their input.
  • The normal distribution underlies the t-test, ANOVA and linear regression.
  • Non-parametric models do not make the same assumptions: e.g. the chi-squared (\(\chi^2\)) test, Mann-Whitney U test, and Spearman’s rank correlation.
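As an illustration, base R’s chisq.test() runs without any normality assumption about the raw observations; a minimal sketch with made-up frequencies:

```r
# Made-up frequencies of three response categories (hypothetical data)
observed <- c(20, 30, 50)

# Chi-squared goodness-of-fit test against equal expected frequencies;
# no normality assumption is made about the raw observations
result <- chisq.test(observed)
result$statistic  # X-squared = 14
result$p.value
```

By contrast, a parametric test such as t.test() assumes (approximately) normally distributed input.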

What do parametric models assume?

  • All parametric models make the same assumptions about their input.
  • Normal distribution is at the heart of parametric models
    • Interval / continuous data
    • Central limit theorem
    • Observations must be independent and identically distributed (iid) for the central limit theorem to apply.
      • See also lecture and workshop week 6
  • Homogeneity of variance
  • Linearity (for continuous predictors in regression models)

In short:

  • Linearity
  • Independence
  • Normality
  • Equal variance (aka homogeneity)

Properties of the normal distribution

– aka the “bell curve”

Histograms

  • Counts / frequencies of observations of x.

Density plots


  • Relative likelihood of x taking on a certain value.
  • The normal distribution is defined by its density function.
  • We don’t need to worry about the maths here.
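For the curious, the density function is available in R as dnorm(); a minimal sketch of the bell curve, assuming base R graphics:

```r
# Density of the standard normal distribution across a grid of x values
x <- seq(-4, 4, by = 0.01)
y <- dnorm(x, mean = 0, sd = 1)

# The curve peaks at the mean (here 0) and is symmetric around it
plot(x, y, type = "l", xlab = "x", ylab = "density")
```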

Symmetric

  • Left and right half are mirror images of each other
  • Mean = Mode = Median

Tails never hit zero

Characterised by mean and standard deviation


Aside: standard normal distribution is mean = 0 and SD = 1

x is continuous

  • y is defined for every value of x
  • x ranges from \(-\infty\) to \(+\infty\)
  • Non-continuous (discrete) data: binary outcomes, count data, ordinal data, psychometric scales

Area under the curve is 1 (=100%)

  • 68% within 1 SD
  • 95% within 2 SDs
  • 99.7% within 3 SDs
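These areas can be verified with R’s cumulative distribution function pnorm():

```r
# Area under the standard normal curve within k SDs of the mean
pnorm(1) - pnorm(-1)  # [1] 0.6826895 (68% within 1 SD)
pnorm(2) - pnorm(-2)  # [1] 0.9544997 (95% within 2 SDs)
pnorm(3) - pnorm(-3)  # [1] 0.9973002 (99.7% within 3 SDs)
```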

Example: intelligence quotient

  • Normal distribution in practice.
  • Total score of standardised tests assessing human intelligence
  • Defined as normally distributed with population values mean = 100, SD = 15
  • \(\sim\) 2/3 between 85 and 115
  • 2.5% \(>\) 130 (gifted)
  • 2.5% \(<\) 70 (impaired)
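These percentages follow directly from pnorm() with mean 100 and SD 15:

```r
# IQ is defined as normally distributed with mean 100 and SD 15
mu <- 100
sigma <- 15

pnorm(115, mu, sigma) - pnorm(85, mu, sigma)  # ~2/3 between 85 and 115
1 - pnorm(130, mu, sigma)  # upper tail beyond 2 SDs (~2.3%, rounded to 2.5%)
pnorm(70, mu, sigma)       # lower tail beyond 2 SDs (~2.3%)
```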

Example: IQ

  • Each person has an individual, unknown IQ value.
  • IQ tests aim to estimate this quantity.
  • Intelligence is abstract by nature and can’t be measured objectively, unlike distance, mass, or income.

Example: IQ

Country        Mean IQ
Hong Kong          107
South Korea        106
Japan              105
Taiwan             104
Singapore          103
Austria            102
Germany            102
Italy              102

  • In practice the IQ test might be biased economically and culturally.
  • IQ scores from 80 countries, taken from Gill (2014, pp. 85–86; data from Lynn & Vanhanen, 2001).

Example: IQ

Country        Mean IQ
Ethiopia            63
Sierra Leone        64
Congo (Zaire)       65
Guinea              66
Zimbabwe            66
Nigeria             67
Ghana               71
Jamaica             72

  • The lowest-scoring countries in the same dataset.

Example for non-normal responses

  • Psychometric scales are neither continuous nor linear (see intro of Bürkner & Vuorre, 2019).


Source: Robinson (2018)

Example for non-normal responses

Psychometric scale; see Robinson (2018)

  • Response categories
  • Limited discrete options (vs continuous sliders)
  • Ordinal: implicit order
  • Categories are not equidistant (unlike, say, inches)
  • See Liddell & Kruschke (2018)
  • We will see why using linear models (lms) is nevertheless not unjustified.

Caveats of normal distributions

  • Strictly speaking, nothing is really normally distributed.
  • Most variables have an upper and lower bound; e.g., response times can’t be faster than 0 secs and people can’t be shorter than 0 inches.
  • All observations are discrete in practice due to limitations of our measuring instruments.
  • However, a normal distribution is often suitable for practical purposes.
  • So we typically want data to be distributed approximately normally.

Interim summary

  • Parametric models assume that the data are normally distributed.
  • However, psychologists often obtain non-normally distributed data.
  • So why do we bother with the normal distribution?
  • We will see in the following that the data don’t need to come from a normal distribution at all.
  • The reason is the central limit theorem.

Recap questions

  • What is a parametric model?
  • What are the assumptions of parametric models?
  • What are the properties of the normal distribution?
  • What are the properties of a continuous variable?
  • Why are psychometric data (often) not continuous / normally distributed?

Central limit theorem (CLT)

  • The sampling distribution will be approximately normal for large sample sizes, regardless of the (type / shape of the) distribution we are sampling from.
  • We can use parametric statistical inference even if we are sampling from a population that is weird (i.e. not normally distributed).
  • From week 6: the mean of the sampling distribution is an estimate of the population mean (\(\mu\); Greek mu).
  • It also works for totals (e.g. IQ scores), SDs, etc.

Demo of CLT

  • CES-D scale: self-report depression (Radloff, 1977)
  • 22 items to assess the degree of depression
  • 5-point Likert scale: Strongly disagree - Strongly agree
  • Item 1: I was bothered by things that usually don’t bother me.
  • Item 2: I had a poor appetite.
  • Item 3: I did not feel like eating, even though I should have been hungry.
  • Item 22: I didn’t enjoy life.

Demo of CLT

N_items <- 22 # 22 items
response_options <- 1:5 # 5-point Likert scale
response_options
[1] 1 2 3 4 5
  • 5-point Likert: strongly disagree (1) – strongly agree (5)

Simulate one participant

ppt_1 <- sample(response_options, N_items, replace = TRUE)
ppt_1
 [1] 1 4 4 4 3 4 3 2 4 1 5 2 2 3 1 3 3 5 2 1 5 4

Simulate one participant

  • The data are not normal (discrete, options 1–5, not symmetric)
  • A participant’s score is the total across the 22 items

Repeat for another participant

(ppt_2 <- sample(response_options, N_items, replace = TRUE))
 [1] 3 1 1 3 4 3 5 1 3 2 5 5 4 2 3 4 2 3 4 5 2 1

Calculate means for each participant

mean(ppt_1); mean(ppt_2)
[1] 3
[1] 3
  • The sampling distribution of the mean will approach normality as the number of participants increases.

Repeat for 10, 100, 1,000 and 10,000 participants
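The repeated sampling in these slides can be condensed into one sketch; replicate(), the number of participants and the seed are choices made here, not part of the original demo:

```r
set.seed(123)  # chosen here for reproducibility (not in the original demo)
N_items <- 22
response_options <- 1:5
n_ppt <- 10000  # number of simulated participants

# One mean CES-D-style score per simulated participant
means <- replicate(n_ppt,
                   mean(sample(response_options, N_items, replace = TRUE)))

# Although single responses are discrete (1-5), the participant means
# pile up symmetrically around 3, the midpoint of the response options
hist(means, breaks = 40, main = "Distribution of participant means")
mean(means)  # close to 3
sd(means)    # close to sqrt(2) / sqrt(N_items), i.e. ~0.30
```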

Demo of CLT

  • The magic: we sampled from discrete data but, using sample means, arrived at a normal distribution.
  • CLT: the distribution of sample means approaches normality as the number of participants increases.
  • Sample size is the crux.
  • This holds provided the observations are iid (independent and identically distributed).

Independent and identically distributed (iid)

  • The most fundamental assumption for the CLT and therefore for statistical tests.
  • Concerns how the data are sampled / obtained.
  • A sample is iid if each observation comes from the same distribution as the others and all observations are mutually independent.

Independence

  • One observation must be unrelated to the next.
  • Assessing the spread of COVID infections: sample only one person per household.

Network example

Independence

  • Self-report depression
  • Item 1: I was bothered by things that usually don’t bother me.
  • Item 2: I had a poor appetite.
  • Item 3: I did not feel like eating, even though I should have been hungry.
  • Different questions relating to the same psychological phenomenon.
  • Violations:
    • repeating the same questions
    • testing the same people multiple times
    • not randomising the presentation order
  • Consequence:
    • Unreliable / biased results

Identical distribution

  • Observations must come from the same distribution
  • or at least the same family of distributions: e.g. normal, count, ordinal, binary
  • Depression example: 22 items about depression, all on a 5-point Likert scale

Identical distribution

  • Violations:
    • measuring responses on different scales (6-point Likert vs continuous slider)
    • studying the effect of Snapchat on self-esteem but including people without Snapchat
    • asking questions about coffee preference to measure depression

Identical distribution

  • Self-report depression items
  • Item 1: I was bothered by things that usually don’t bother me.
  • Item 2: Flat white is too bitter.
  • Item 3: I did not feel like eating, even though I should have been hungry.

Recap questions

  • What is the role of the CLT in the context of normal distributions?
  • What is iid?
  • Why is sample size important in the context of model assumptions?

Epilogue

Summary

  • Parametric models expect data with certain properties.
  • Violations of parametric assumptions lead to unreliable results.
  • The normal distribution is at the centre of parametric models.
  • The normal distribution can be characterised by a range of properties.
  • Requirements for (approximate) normality of the sampling distribution:
    • a large enough sample (so the CLT applies)
    • independent and identically distributed observations

Useful textbook resources

  • Field et al. (2012) Chapter 5 (with R code)
  • Chapter 12.2 here
  • Baguley (2012) Chapter 9 (with R code)
  • Matloff (2019) Chapter 8 and 9 (also 7) (with R code)
  • Coolican (2018) Chapter 17 (pages 483–486)
  • Howitt & Cramer (2014) Chapter 5

Outlook

  • Workshop on normal distribution and CLT
  • Model evaluation and violations (lecture + workshop)

References

Baguley, T. (2012). Serious stats: A guide to advanced statistics for the behavioral sciences. Macmillan International Higher Education.

Bürkner, P.-C., & Vuorre, M. (2019). Ordinal regression models in psychology: A tutorial. Advances in Methods and Practices in Psychological Science, 2(1), 77–101.

Coolican, H. (2018). Research methods and statistics in psychology. Routledge.

Field, A., Miles, J., & Field, Z. (2012). Discovering statistics using R. Sage Publications.

Gill, J. (2014). Bayesian methods: A social and behavioral sciences approach (Vol. 20). CRC Press.

Howitt, D., & Cramer, D. (2014). Introduction to statistics in psychology (6th ed.). Pearson Education.

Liddell, T. M., & Kruschke, J. K. (2018). Analyzing ordinal data with metric models: What could possibly go wrong? Journal of Experimental Social Psychology, 79, 328–348.

Lynn, R., & Vanhanen, T. (2001). National IQ and economic development: A study of eighty-one nations. Mankind Quarterly, 41(4), 415–435.

Matloff, N. (2019). Probability and statistics for data science: Math + R + data. CRC Press.

Radloff, L. S. (1977). The CES-D scale: A self-report depression scale for research in the general population. Applied Psychological Measurement, 1(3), 385–401.

Robinson, M. A. (2018). Using multi-item psychometric scales for research and practice in human resource management. Human Resource Management, 57(3), 739–750.